Abstract 

For the multiuser multiple-input multiple-output (MIMO) downlink channel, the users feed back 
their channel state information (CSI) to help the base station (BS) schedule users and improve the 
system sum rate. However, this incurs a large aggregate feedback bandwidth which grows linearly with 
the number of users. In this paper, we propose a novel scheme to reduce the feedback load in a downlink 
orthogonal space division multiple access (SDMA) system with zero-forcing receivers by allowing the 
users to dynamically determine the number of feedback bits to use according to multiple decision 
thresholds. Through theoretical analysis, we show that, while keeping the aggregate feedback load of 
the entire system constant regardless of the number of users, the proposed scheme almost achieves the 
optimal asymptotic sum rate scaling with respect to the number of users (also known as the multiuser 
diversity). Specifically, given the number of thresholds, the proposed scheme can achieve a constant 
portion of the optimal sum rate achievable only by the system where all the users always feed back, and 
the remaining portion (referred to as the sum rate loss) decreases exponentially to zero as the number 
of thresholds increases. By deriving a tight upper bound for the sum rate loss, the minimum number of 
thresholds for a given tolerable sum rate loss is determined. In addition, a fast bit allocation method is 
discussed for the proposed scheme, and the simulation results show that the sum rate performances with 
the complex optimal bit allocation method and with the fast algorithm are almost the same. We compare 
our multi-threshold scheme to some previously proposed feedback schemes. Through simulation, we 
demonstrate that the proposed scheme can reduce the feedback load and utilize the limited feedback 
bandwidth more effectively than the existing feedback methods. 
I. Introduction 

Multiple-input multiple-output (MIMO) technologies can provide spatial diversity in wireless 
fading channels to improve the communication quality. In particular, recent studies have shown 
that the sum rates of MIMO systems can be increased when the base station (BS) communicates 
with multiple users simultaneously [1]. For the downlink broadcast channel employing multiple 
antennas, it has been shown recently that dirty paper coding (DPC) [2] achieves the capacity [3]. 
However, this capacity achieving scheme is difficult to derive and has a high encoding/decoding 
complexity. Thus, several works resorted to the more practical (but suboptimal) space division 
multiple access (SDMA) based designs. For example, zero-forcing beamforming (ZF-BF) was 
shown in [4] to achieve the optimal sum rate growth. However, both the DPC and the ZF schemes 
require perfect channel state information (CSI) feedback from the users to the BS to achieve the 
optimal performance. This may result in high feedback load and is not practical. 

In [5], [6], a model was proposed to analyze the sum rate loss due to imperfect (quantized) 
CSI. In the system considered there, each user quantizes the channel vector to one of the $N = 2^B$ 
quantization vectors and feeds back the codebook index using $B$ bits to the BS to capture the 
spatial direction and magnitude of the channel. To reduce the feedback load, the orthogonal 
random beamforming (ORB) scheme [7] can be used. In the ORB scheme, the BS transmits 
through orthogonal beamforming vectors to the users, and each user only needs to feed back 
its received signal-to-interference-plus-noise ratios (SINR) on different orthogonal beamforming 
vectors for the purpose of scheduling. It was shown in [7] that the ORB exhibits the same sum 
rate growth as the DPC and the ZF-BF based schemes when the number of users is large. 

There are other previous works that sought to reduce the feedback load at the scheduling 
stage. In [8], a threshold was set according to the scheduling outage probability such that a user 
does not need to feed back when its CSI is below the threshold. This method reduces the system 
feedback load without affecting the scheduling performance much. In [9], multiple thresholds 
were set, and the scheduler utilized a polling process to select the best feedback threshold from 
these thresholds to further reduce the aggregate feedback load. However, the drawback of this 
scheme is the large delay incurred by the polling process. In [10], another scheme was proposed 
to reduce the feedback load of the ZF-BF systems through two-stage feedback. In the first 
stage, each user feeds back a coarsely quantized version of its CSI, and thus the BS has some 
information to determine which users to schedule. The BS then broadcasts to the scheduled users 
and asks them to feed back finer CSI to achieve good ZF-BF performance in the second stage. 
The drawback of this scheme is also the delay incurred by the two-stage feedback process. 

From the above discussion, it is clear that the feedback load of multiuser MIMO systems can be 
reduced if the scheduling mechanism is taken into consideration. However, most existing works 
only use the scheduling mechanism to control the amount of feedback, but do not incorporate the 
properties of scheduling into the CSI quantization design. In view of this, in this paper we propose 
to reduce the feedback load by incorporating the scheduling mechanism in both the determination 
of the amount to feedback and the CSI quantization. The proposed scheme divides the range of 
CSI into multiple regions according to the order statistics of the received signal-to-noise ratio 
(SNR) which reflect the properties of scheduling. Each region corresponds to a range of SNR, and 
is quantized with a specific number of bits to further assist scheduling and link adaptation. The 
CSI feedback thus consists of two parts: one indicating the region that the received SNR falls in, 
and the other being the quantized result of that region. For a given number of regions, we derive 
a tight upper bound for the sum rate loss of the proposed scheme as compared to systems with 
perfect CSI feedback from all users. Then, for any given tolerable sum rate loss of the system, 
the minimum number of regions required is derived. For example, the proposed scheme with 
four regions is good enough to keep the sum rate loss smaller than 0.25 bps/Hz for the number 
of users less than 100. In addition, the aggregate feedback load and the multiuser diversity using 
the proposed scheme are also investigated. Our theoretical analysis shows that, in contrast to the 
existing feedback schemes whose aggregate feedback loads increase with the number of users, 
with a given number of regions, the proposed scheme has a constant feedback load regardless of 
the number of users. Moreover, while keeping the feedback load constant, the proposed scheme 
almost achieves the optimal asymptotic sum rate scaling with respect to the number of users 
(that is, the multiuser diversity). Specifically, given the number of regions, the proposed scheme 
can achieve a constant portion of the optimal sum rate achievable only by the system where all 
the users always feed back, and the sum rate loss decreases exponentially to zero as the number 
of regions increases. Through simulation, we verify these analytical results, and demonstrate that 
the proposed scheme can reduce the feedback load and utilize the limited feedback bandwidth 
more effectively than the existing feedback methods. A fast bit allocation method that assigns 
different numbers of quantization bits to different regions is also discussed. The simulation results 
show that the sum rate performances with the complex optimal bit allocation method and with 
the fast algorithm are almost the same. 

Note that the information required for the proposed scheme to operate, such as the SNR 
statistics and the number of users, is usually known at the BS. Thus, in practice, the BS can 
compute the region thresholds and broadcast them to the users periodically, or periodically 
broadcast the parameters of the SNR statistics and the number of users so that the users can 
derive the thresholds themselves. 

The remainder of this paper is organized as follows. Section II describes the system model 
and briefly reviews the order statistics. Section III introduces the proposed feedback scheme and 
analyzes its sum rate loss and multiuser diversity. In Section IV, the bit allocation problem is 
discussed along with the feedback load analysis. We then give the simulation results in Section V 
and conclude the paper in Section VI. 

Notation: Vectors and matrices are denoted by boldface lower case and capital letters, respectively. 
$E\{\cdot\}$ refers to the expected value of a random variable. $\mathbf{X}^T$ ($\mathbf{x}^T$) stands for the transpose 
of matrix $\mathbf{X}$ (vector $\mathbf{x}$), and $\mathbf{X}^*$ ($\mathbf{x}^*$) stands for the conjugate transpose of matrix $\mathbf{X}$ (vector $\mathbf{x}$). 
Moreover, $\mathbf{X}^{\dagger}$ denotes the pseudo-inverse $\mathbf{X}^*(\mathbf{X}\mathbf{X}^*)^{-1}$. The function $\lceil x \rceil$ represents the smallest 
integer $\ge x$. $\log$ and $\ln$ are the logarithms with base 2 and $e$, respectively. 

II. System Model 

The multiuser MIMO downlink system model is shown in Fig. 1, where the BS is equipped with $M_t$ antennas. There are $K$ users in the system and each user has $M_r$ receive antennas. We consider a full buffer traffic model, that is, each user always has data in the buffer to transmit. According to the ORB strategy for multiuser transmission, the BS uses a precoding matrix $\mathbf{W} = [\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_{M_t}]$, where $\mathbf{w}_i \in \mathbb{C}^{M_t}$, $i = 1, 2, \ldots, M_t$, are random orthogonal vectors generated from an isotropic distribution [11]. The received signal at the $k$-th user can be mathematically described as:

$$\mathbf{y}_k = \mathbf{H}_k \mathbf{W} \mathbf{s} + \mathbf{n}_k, \qquad (1)$$

where $\mathbf{H}_k$ is the $M_r \times M_t$ complex Gaussian channel matrix between the BS and the $k$-th user, and $\mathbf{n}_k$ is the $M_r \times 1$ additive white Gaussian noise (AWGN) vector at the $k$-th user. The entries of $\mathbf{H}_k$ and $\mathbf{n}_k$ are assumed to be independent identically distributed (i.i.d.) complex Gaussian with zero mean and unit variance. In addition, the channel matrices for different users are assumed to be independent. Note that in this paper we consider only identical channel distributions for the users, for simplicity of demonstrating the idea. The more practical situations where the users have different channel statistics or distributions are more intricate, and are discussed in [12]. The vector $\mathbf{s} = [s_1, s_2, \ldots, s_{M_t}]^T$ is the $M_t \times 1$ vector of the transmitted signal. It is assumed that the feedback channel is error-free and delay-free. The total transmitted power is a constant $P_t$, so that $E\{\mathbf{s}^*\mathbf{s}\} = P_t$. Under the equal power assumption of the ORB, each beam is allocated the same power $\rho = P_t/M_t$.

Fig. 1. Multiuser MIMO downlink system model.

We consider zero-forcing receivers. The received signal after the zero-forcing filter is given by

$$(\mathbf{H}_k\mathbf{W})^{\dagger}\mathbf{y}_k = \mathbf{s} + (\mathbf{H}_k\mathbf{W})^{\dagger}\mathbf{n}_k. \qquad (2)$$

Therefore, the received SNR of the $m$-th signal $s_m$ at the $k$-th user with the ZF receiver is given by

$$\mathrm{SNR}_{m,k} = \frac{\rho}{\left[\left((\mathbf{H}_k\mathbf{W})^*(\mathbf{H}_k\mathbf{W})\right)^{-1}\right]_m}, \qquad (3)$$

where $[\mathbf{A}]_m$ denotes the $m$-th diagonal element of matrix $\mathbf{A}$. Assuming that $M_r \ge M_t$, it is well known that $\mathrm{SNR}_{m,k}$ is a chi-square random variable with $2(M_r - M_t + 1)$ degrees of freedom [13]. For simplicity, we let $X_{m,k} = \mathrm{SNR}_{m,k}$, and then the probability density function (PDF) of $X_{m,k}$ can be expressed as

$$f_{X_{m,k}}(x) = \frac{e^{-x/\rho}}{\rho\,(M_r - M_t)!}\left(\frac{x}{\rho}\right)^{M_r - M_t}, \quad \forall m, k. \qquad (4)$$

Consequently, the cumulative distribution function (CDF) of $X_{m,k}$ is given by

$$F_{X_{m,k}}(x) = 1 - \frac{\Gamma\left(M_r - M_t + 1, \frac{x}{\rho}\right)}{(M_r - M_t)!}, \quad \forall m, k, \qquad (5)$$

where $\Gamma(a, x) = \int_x^{\infty} t^{a-1}e^{-t}\,dt$ is the upper incomplete gamma function. According to (4), when the transmitter and the receiver have the same number of antennas, $X_{m,k}$ has an exponential distribution with parameter $1/\rho$ (mean $\rho$). For simplicity of the derivation, we will consider this $M_r = M_t$ case for the ORB system. Extension to the other cases is straightforward, but the mathematical expressions are more complicated.
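
The relation between the SNR distribution and the antenna configuration can be checked numerically. The following is a minimal sketch, not part of the original analysis: the function names `zf_snr_pdf` and `zf_snr_cdf` are hypothetical, and the code assumes the Erlang form of the chi-square law with integer shape $a = M_r - M_t + 1$ and scale $\rho$, for which the upper incomplete gamma function reduces to a finite sum.

```python
import math

def zf_snr_pdf(x, rho, m_r, m_t):
    """PDF of the per-stream ZF SNR (Erlang with shape m_r - m_t + 1, scale rho)."""
    if x < 0:
        return 0.0
    d = m_r - m_t
    z = x / rho
    return math.exp(-z) * z ** d / (rho * math.factorial(d))

def zf_snr_cdf(x, rho, m_r, m_t):
    """CDF of the per-stream ZF SNR.

    For integer a, Gamma(a, z) / (a - 1)! = exp(-z) * sum_{n < a} z**n / n!,
    so no special-function library is needed.
    """
    if x <= 0:
        return 0.0
    a = m_r - m_t + 1
    z = x / rho
    tail = math.exp(-z) * sum(z ** n / math.factorial(n) for n in range(a))
    return 1.0 - tail

# With M_r = M_t the CDF collapses to the exponential CDF 1 - exp(-x / rho).
print(zf_snr_cdf(5.0, 10.0, 4, 4), 1.0 - math.exp(-0.5))
```

With $M_r = M_t$ the sum has a single term, which is the exponential case used in the rest of the derivation.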

We consider that the maximum sum rate scheduling algorithm is employed at the BS. That is, on each beam direction, the BS selects, among the users who have fed back their CSI, the user that has the best channel to transmit to. If none of the users has fed back the CSI, the BS randomly selects one user to transmit to. Due to the symmetric property, we drop the direction index $m$ of $X_{m,k}$, and let $X_k$ represent the SNR of user $k$ for a certain beam. Let $X_{(1)}, X_{(2)}, \ldots, X_{(K)}$ be the order statistics of the i.i.d. continuous random variables $X_1, X_2, \ldots, X_K$ with the common PDF (4), in decreasing order, i.e., $X_{(1)} \ge X_{(2)} \ge \cdots \ge X_{(K)}$. The CDF and PDF of $X_{(j)}$, respectively, are given by [14]:

$$F_{X_{(j)}}(x) = \sum_{i=K-(j-1)}^{K} \frac{K!}{i!\,(K-i)!}\left[F_{X_{m,k}}(x)\right]^{i}\left[1 - F_{X_{m,k}}(x)\right]^{K-i}, \quad -\infty < x < \infty, \qquad (6)$$

$$f_{X_{(j)}}(x) = \frac{K!\, f_{X_{m,k}}(x)\left[F_{X_{m,k}}(x)\right]^{K-j}\left[1 - F_{X_{m,k}}(x)\right]^{j-1}}{(K-j)!\,(j-1)!}, \quad -\infty < x < \infty. \qquad (7)$$
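
As a quick illustration of the order-statistic expressions, the sketch below evaluates the CDF and PDF of the $j$-th largest of $K$ i.i.d. samples from the value $F$ of the common CDF (and $f$ of the common PDF) at the point of interest. The function names are hypothetical and chosen only for this example.

```python
import math

def order_stat_cdf(j, K, F):
    """P{X_(j) <= x} for the j-th largest of K i.i.d. samples, with F = F_X(x).

    X_(j) <= x holds exactly when at least K - j + 1 samples are <= x.
    """
    return sum(math.comb(K, i) * F ** i * (1.0 - F) ** (K - i)
               for i in range(K - j + 1, K + 1))

def order_stat_pdf(j, K, F, f):
    """Density of the j-th largest sample, with F = F_X(x) and f = f_X(x)."""
    coef = math.factorial(K) // (math.factorial(K - j) * math.factorial(j - 1))
    return coef * f * F ** (K - j) * (1.0 - F) ** (j - 1)

# j = 1 is the maximum, whose CDF is simply F**K.
print(order_stat_cdf(1, 10, 0.7), 0.7 ** 10)
```

The two extreme cases are easy to check by hand: $j = 1$ gives $F^K$ (the maximum) and $j = K$ gives $1 - (1 - F)^K$ (the minimum).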

With the order statistics, the sum rate using the maximum sum rate scheduling algorithm can be computed. As a simple example, if every user has the probability $P_f$ to feed back the CSI for a particular beam direction to the BS, and the feedback events are independent of the value of the CSI and independent from user to user and from beam to beam, the sum rate can be obtained by

$$E\left\{\sum_{n=1}^{K}\frac{K!\,P_f^{\,n}(1-P_f)^{K-n}}{n!\,(K-n)!}\log\left(1 + X_{(1)}^{(n)}\right) + (1-P_f)^{K}\log(1 + X_k)\right\}, \qquad (8)$$

where $X_{(1)}^{(n)}$ denotes the largest SNR among the $n$ users that feed back, and the second term is the rate when no user feeds back to the BS and the BS randomly schedules one user $k$ on a certain beam.

III. The Multi-threshold Feedback Scheme 

For the scheme in [8], if the SNR of a user is greater than the outage threshold, the user feeds 
back $B_Q$ bits to represent the received SNR. Otherwise it does not feed back. The threshold is 
derived according to a pre-determined scheduling outage probability (where "scheduling outage" 
refers to the situation when none of the users feeds back), but is not directly related to the scheduling 
mechanism. Since the maximum sum rate scheduler selects users according to their SNR orders, 
it is more meaningful to set the threshold according to the order statistics of the received SNR. 

The basic idea of our proposed scheme is to let a user compare its received SNR with the thresholds derived from the order statistics. The user can thus guess its most probable rank among all the users and, if its rank is high enough to give it a good chance of being scheduled, it feeds back its SNR. Otherwise the user does not feed back, in order to save the reverse link resource and avoid interfering with the other users' reverse link transmissions. Note that there might be errors in the statistical inference by the individual users about their SNR ranks. These errors may result in the situation where the users who actually have high SNRs do not feed back, and the BS does not have proper users to select from. To make up for the sum rate loss due to this situation, we allow the users with several (guessed) ranks to feed back. Therefore, for each beam direction, a set of $N$ thresholds $R_{th} = \{r_{th,1}, r_{th,2}, \ldots, r_{th,N}\}$ is set (see Fig. 2) according to the order statistics of the received SNR. Let $r_{th,0} = \infty$. For SNR region $i$, bounded by the adjacent thresholds as $[r_{th,i}, r_{th,i-1})$, $b_i$ additional quantization bits are used to help the BS differentiate users whose SNRs fall in that region, and to make better link adaptation. According to the importance of the SNR regions to the sum rate, $b_i$, $i = 1, \ldots, N$ (to be optimized later), are usually in non-increasing order. When the received SNR is higher than $r_{th,N}$, the user feeds back its rank and the additional quantization bits. Otherwise the user does not feed back at all.

A. Derivation of Multiple Thresholds 

1) i.i.d. case: When user $k$ has on its $m$-th beam $\mathrm{SNR}_{m,k} = snr_{m,k}$ and the users have i.i.d. SNR distributions, the probability that user $k$'s SNR on the $m$-th beam direction is ranked the $p$-th among all the users is

$$P\{X_k = X_{(p)} \mid X_k = snr_{m,k}\} = \frac{(K-1)!\,\left[F_{X_{m,k}}(snr_{m,k})\right]^{K-p}\left[1 - F_{X_{m,k}}(snr_{m,k})\right]^{p-1}}{(K-p)!\,(p-1)!} \qquad (9)$$

and satisfies

$$\sum_{p=1}^{K} P\{X_k = X_{(p)} \mid X_k = snr_{m,k}\} = 1, \qquad (10)$$

where $F_{X_{m,k}}(x)$ is defined in (5).

Fig. 2. Multi-threshold feedback model.

For example, the probability that user $k$ has on its $m$-th beam $\mathrm{SNR}_{m,k} = snr_{m,k}$ which is the highest SNR among all the users on the $m$-th beam is

$$P\{X_k = X_{(1)} \mid X_k = snr_{m,k}\} = \frac{P\{X_k = X_{(1)} = snr_{m,k}\}}{P\{X_k = snr_{m,k}\}} = \left[F_{X_{m,k}}(snr_{m,k})\right]^{K-1}. \qquad (11)$$

With $\mathrm{SNR}_{m,k} = snr_{m,k}$, user $k$ can infer its most probable rank among the users on the $m$-th beam as

$$\mathrm{rank}(snr_{m,k}) = \arg\max_{p = 1, \ldots, K} P\{X_k = X_{(p)} \mid X_k = snr_{m,k}\}. \qquad (12)$$
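
The rank inference can be sketched in a few lines. The example below assumes the exponential SNR distribution of the $M_r = M_t$ case with mean `rho`; the function names are hypothetical. It evaluates the posterior rank probability for each $p$ and returns the maximizer.

```python
import math

def rank_posterior(p, K, F):
    """Probability that a user is ranked p-th among K users, given F = F_X(snr).

    (K-1)! / ((K-p)! (p-1)!) is the binomial coefficient C(K-1, p-1).
    """
    return math.comb(K - 1, p - 1) * F ** (K - p) * (1.0 - F) ** (p - 1)

def most_probable_rank(snr, K, rho):
    """Most probable rank for an exponentially distributed SNR with mean rho."""
    F = 1.0 - math.exp(-snr / rho)
    return max(range(1, K + 1), key=lambda p: rank_posterior(p, K, F))

K, rho = 10, 10.0
for snr in (0.0, 20.0, 100.0):
    print(snr, most_probable_rank(snr, K, rho))
```

The posterior is a binomial distribution over the number of users with a higher SNR, so the probabilities over $p = 1, \ldots, K$ sum to one.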



2) non-i.i.d. case: In practical systems, the users may be at different distances to the BS. Thus the CDFs of the users' SNRs $F_{X_{m,j}}(x)$, $j = 1, 2, \ldots, K$, may not be identical as in (5). Assuming that the users' SNRs are independent, the probability that user $k$'s SNR on the $m$-th beam direction is ranked the $p$-th among all the users becomes

$$P\{X_k = X_{(p)} \mid X_k = snr_{m,k}\} = \sum_{s \in S}\left\{\prod_{j=1}^{K-p} F_{X_{m,s(j)}}(snr_{m,k}) \prod_{j=K-p+1}^{K-1}\left[1 - F_{X_{m,s(j)}}(snr_{m,k})\right]\right\}, \qquad (13)$$

where $S$ is a set of permutation functions defined by

$$S = \{s : \{1, \ldots, K-1\} \to \{1, \ldots, k-1, k+1, \ldots, K\} \mid s(\cdot) \text{ invertible},\ s(1) < \cdots < s(K-p) \text{ and } s(K-p+1) < \cdots < s(K-1)\},$$

where the last two conditions are to avoid multiple counting of the same combination. The most probable rank of user $k$ can again be found using (12).
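
A minimal sketch of the non-i.i.d. rank probability follows. It assumes that the caller supplies the CDF values of the other $K-1$ users evaluated at user $k$'s SNR; the function name and argument layout are hypothetical. Each element of the set $S$ corresponds to one choice of which $K-p$ users fall below user $k$, so the sum over $S$ is implemented as a sum over combinations.

```python
import math
from itertools import combinations

def rank_posterior_non_iid(p, F_others):
    """Probability that user k is ranked p-th among K users with independent,
    non-identical SNR distributions.

    F_others[j] is the CDF of the j-th other user evaluated at user k's SNR,
    so len(F_others) == K - 1.
    """
    n = len(F_others)          # K - 1 other users
    below = n - (p - 1)        # K - p users must have a lower SNR
    total = 0.0
    for lower in combinations(range(n), below):
        lower_set = set(lower)
        term = 1.0
        for j, F in enumerate(F_others):
            term *= F if j in lower_set else (1.0 - F)
        total += term
    return total

# Four other users with different channel statistics (K = 5).
cdf_values = [0.9, 0.5, 0.2, 0.7]
print([round(rank_posterior_non_iid(p, cdf_values), 4) for p in range(1, 6)])
```

The number of combinations grows as $\binom{K-1}{p-1}$, which is one reason the non-i.i.d. case is considerably more involved than the i.i.d. case.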

To this end, the regions in Fig. 2 are defined such that for all the SNR values in region $j$, i.e., $\forall\, snr_{m,k} \in [r_{th,j}, r_{th,j-1})$, $\mathrm{rank}(snr_{m,k}) = j$. Thus the number of regions $N$ must be no larger than the number of users $K$. The corresponding thresholds $r_{th,j}$, $j = 1, 2, \ldots, N$, can then be determined accordingly. All the thresholds can be computed off-line as long as the number of users and the channel statistics are known. The values of the thresholds can be updated periodically according to the system configuration and channel statistics, possibly broadcast by the BS.

In this paper we will consider only i.i.d. SNR distributions, for simplicity of demonstrating the idea. The non-i.i.d. case is much more intricate, and is handled separately in [12]. If a user finds its SNR on a beam lower than $r_{th,N}$, then no feedback is sent for that beam. Otherwise, the user feeds back $B_R = \lceil \log_2(N) \rceil$ bits to indicate its most probable rank on that beam. In order to account for the situation where more than one user reports the same rank, each region $j$ is further quantized with $b_j$ bits, which are also fed back together with the "rank" bits.
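
The user-side decision described above can be sketched as follows. This is an illustrative helper with a hypothetical name, `feedback_decision`; it takes the thresholds as input (listed from $r_{th,1}$ down to $r_{th,N}$) and returns either the region index together with the number of rank bits, or `None` when the user stays silent.

```python
import math

def feedback_decision(snr, thresholds):
    """Decide what a user reports for one beam.

    thresholds = [r_th_1, ..., r_th_N], strictly decreasing.
    Returns (region index j, number of rank bits B_R), or None when the SNR
    is below the lowest threshold r_th_N and nothing is fed back.
    """
    n_regions = len(thresholds)
    rank_bits = math.ceil(math.log2(n_regions))
    # Region j is [r_th_j, r_th_{j-1}); scan from the highest threshold down.
    for j, r in enumerate(thresholds, start=1):
        if snr >= r:
            return j, rank_bits
    return None

print(feedback_decision(2.5, [3.0, 2.0, 1.0]))  # falls in region 2
print(feedback_decision(0.5, [3.0, 2.0, 1.0]))  # below r_th_N: no feedback
```

The additional $b_j$ quantization bits for the selected region would be appended to this report; they are omitted here because the bit allocation is discussed separately.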

Due to the symmetric assumption that the users suffer i.i.d. Rayleigh fading processes, and due to the ORB, the same set of thresholds applies to all users and all beam directions. Since $\mathrm{rank}(snr_{m,k}) = j$ when $snr_{m,k} \in [r_{th,j}, r_{th,j-1})$, $\mathrm{rank}(snr_{m,k}) = j+1$ when $snr_{m,k} \in [r_{th,j+1}, r_{th,j})$, and the probability $P\{X_k = X_{(p)} \mid X_k = snr_{m,k}\}$ in (9) is a continuous function of $snr_{m,k}$ for $p = 1, 2, \ldots, K$, the probabilities of ranks $j$ and $j+1$ must coincide at $snr_{m,k} = r_{th,j}$. Using (5) and (9) we have

$$F_{X_{m,k}}(r_{th,j}) = 1 - e^{-r_{th,j}/\rho} = \frac{K-j}{K}, \quad \text{i.e.,} \quad r_{th,j} = \rho\ln\frac{K}{j}, \quad j = 1, 2, \ldots, N. \qquad (14)$$

Then

$$P\{X_k \in [r_{th,j}, r_{th,j-1})\} = \frac{1}{K}, \quad j = 1, 2, \ldots, N. \qquad (15)$$

In other words, the probability for a user to infer itself as being ranked in the $j$-th place on a certain beam direction is $\frac{1}{K}$ for all $j = 1, 2, \ldots, N$, with $N \le K$. This is intuitive because each user has the same probability of being ranked in the $j$-th place, $j = 1, 2, \ldots, K$, due to the symmetric assumption of the users' SNR distributions. Therefore, the probability $P_f$ for a user to feed back is

$$P_f = 1 - F_{X_{m,k}}(r_{th,N}) = e^{-r_{th,N}/\rho} = \frac{N}{K}. \qquad (16)$$
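
The threshold structure can be verified numerically. The sketch below assumes the exponential SNR model with mean $\rho$ and the threshold expression $r_{th,j} = \rho\ln(K/j)$, which follows from equating the probabilities of ranks $j$ and $j+1$ at the boundary; the function names are hypothetical. It checks that every region carries probability $1/K$ and that the feedback probability equals $N/K$.

```python
import math

def thresholds(K, N, rho):
    """Region thresholds r_th_1 > r_th_2 > ... > r_th_N for K users."""
    return [rho * math.log(K / j) for j in range(1, N + 1)]

def region_probabilities(K, N, rho):
    """P{X_k in [r_th_j, r_th_{j-1})} for j = 1..N, with r_th_0 = infinity."""
    edges = [math.inf] + thresholds(K, N, rho)
    cdf = lambda x: 1.0 - math.exp(-x / rho)   # exp(-inf) evaluates to 0.0
    return [cdf(edges[j - 1]) - cdf(edges[j]) for j in range(1, N + 1)]

def feedback_probability(K, N, rho):
    """Probability that a user exceeds the lowest threshold r_th_N."""
    return math.exp(-thresholds(K, N, rho)[-1] / rho)

K, N, rho = 100, 4, 10.0
print(thresholds(K, N, rho))
print(region_probabilities(K, N, rho))
print(feedback_probability(K, N, rho))
```

Since each user feeds back with probability $N/K$, the expected number of users reporting on a beam is $N$ regardless of $K$, which is the source of the constant aggregate feedback load.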

B. Minimum Number of Regions and Sum Rate Loss Analysis 

One important issue regarding the multi-threshold design is the number of regions to be 
applied. The number of regions affects the sum rate loss. If the number of regions increases, 
which results in higher feedback load, the sum rate loss will decrease. Thus, for a given tolerable 
sum rate loss, we should apply the minimum number of regions required to minimize the feedback 
load. 

Let the rate loss event be defined as the event that all users' SNRs on a certain beam are smaller than the threshold $r_{th,N}$, so that the BS randomly schedules one user because none has fed back. Note that this event is called the scheduling outage event in [8]. The probability of the rate loss event is then

$$P_L = P\{X_{(1)} < r_{th,N}\} = \left(1 - e^{-r_{th,N}/\rho}\right)^K. \qquad (17)$$

Without loss of generality, assume that user $k$ is selected by the BS in a rate loss event. The sum rate loss compared to the case when the users always feed back is

$$\begin{aligned}
\Delta R(K, N) &= M_t\, E\{\log(1 + X_{(1)}) - \log(1 + X_k) \mid X_{(1)} < r_{th,N}\}\, P_L \\
&\le M_t\, E\{\log(1 + (X_{(1)} - X_k)) \mid X_{(1)} < r_{th,N}\}\, P_L \\
&\le M_t \log\left(1 + E\{(X_{(1)} - X_k) \mid X_{(1)} < r_{th,N}\}\right) P_L \\
&\triangleq \Delta R_U(K, N), \qquad (18)
\end{aligned}$$
where the inequalities are due to the concavity of the rate function and Jensen's inequality. By using (14),

$$\begin{aligned}
E\{(X_{(1)} - X_k) \mid X_{(1)} < r_{th,N}\}
&= \frac{\int_0^{r_{th,N}} x\,\frac{K}{\rho}e^{-x/\rho}\left(1 - e^{-x/\rho}\right)^{K-1}dx}{\left(1 - e^{-r_{th,N}/\rho}\right)^{K}} - \frac{\int_0^{r_{th,N}} x\,\frac{1}{\rho}e^{-x/\rho}\,dx}{1 - e^{-r_{th,N}/\rho}} \\
&= \frac{K}{\rho\left(1 - \frac{N}{K}\right)^{K}}\sum_{i=0}^{K-1}\frac{(-1)^i (K-1)!}{i!\,(K-i-1)!}\cdot\frac{1 - (c\,r_{th,N} + 1)e^{-c\,r_{th,N}}}{c^2} - \rho\left(1 - \frac{\ln(K/N)}{K/N - 1}\right), \qquad (19)
\end{aligned}$$

where $c = \frac{i+1}{\rho}$. Using (18) and (19), the minimum number of regions required for a given tolerable sum rate loss $\Delta R_P(K)$ can be effectively approximated by comparing the sum rate loss upper bound $\Delta R_U(K, N)$ with $\Delta R_P(K)$. Fig. 3 compares the sum rate loss upper bound with the actual sum rate loss obtained by simulation, and shows that the sum rate loss upper bound (18) with (19) is tight (within 0.1 bps/Hz) when the number of regions is large. This figure also shows that four regions are enough to keep the sum rate loss smaller than the tolerable sum rate loss $\Delta R_P(K) = 0.25$ bps/Hz when the number of users is less than 100.

Fig. 3. Sum rate loss versus the number of users. $\Delta R_P(K, N) = 0.25$ bps/Hz, $\rho = 10$ dB.
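
The procedure of comparing the upper bound with a tolerable loss can be sketched numerically. The code below is an illustration under stated assumptions, not the authors' implementation: it assumes the exponential SNR model with thresholds $r_{th,N} = \rho\ln(K/N)$, evaluates the conditional mean of the largest SNR by Simpson integration rather than by the alternating binomial sum (which is prone to cancellation in floating point for large $K$), and uses hypothetical function names. The value of $M_t$ in the usage line is likewise only an example.

```python
import math

def _simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def loss_upper_bound(K, N, rho, M_t):
    """Upper bound on the sum rate loss (bps/Hz) for K users and N regions."""
    if N >= K:
        return 0.0                      # every user feeds back: no rate loss event
    r = rho * math.log(K / N)           # lowest threshold r_th_N
    F_r = 1.0 - N / K                   # exponential CDF evaluated at r_th_N
    p_loss = F_r ** K                   # probability that nobody feeds back

    def cond_pdf_max(x):
        # Density of the largest SNR conditioned on it being below r_th_N.
        F = 1.0 - math.exp(-x / rho)
        return (K / rho) * math.exp(-x / rho) * (F / F_r) ** (K - 1) / F_r

    e_max = _simpson(lambda x: x * cond_pdf_max(x), 0.0, r)
    e_rand = (rho - (rho + r) * N / K) / F_r   # mean SNR of a random user below r_th_N
    gap = max(e_max - e_rand, 0.0)
    return M_t * math.log2(1.0 + gap) * p_loss

def min_regions(K, rho, M_t, tolerable_loss):
    """Smallest N whose loss upper bound does not exceed the tolerable loss."""
    for N in range(1, K + 1):
        if loss_upper_bound(K, N, rho, M_t) <= tolerable_loss:
            return N
    return K

print(min_regions(K=50, rho=10.0, M_t=4, tolerable_loss=0.25))
```

Because the bound is dominated by the factor $(1 - N/K)^K \approx e^{-N}$, it falls quickly with $N$, so the search typically terminates after a few iterations.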

Another important issue is how much the sum rate can be increased when the number of regions is increased by one. By (14) and (17), when the number of regions increases from $N$ to $N+1$, the probability of the rate loss event is reduced by

$$P_I = P\{r_{th,N+1} < X_{(1)} < r_{th,N}\} = \left(\frac{K-N}{K}\right)^{K} - \left(\frac{K-N-1}{K}\right)^{K}. \qquad (20)$$
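
The reduction in the rate loss probability is a simple difference of two terms, as the short sketch below shows (function names are hypothetical; the exponential SNR model with $e^{-r_{th,N}/\rho} = N/K$ is assumed). Summing the reductions from $N$ up to $K-1$ telescopes back to the rate loss probability with $N$ regions.

```python
def loss_probability(K, N):
    """Probability that no user exceeds r_th_N, i.e. (1 - N/K)**K."""
    return (1.0 - N / K) ** K

def loss_probability_reduction(K, N):
    """Reduction of the rate loss probability when going from N to N + 1 regions."""
    return loss_probability(K, N) - loss_probability(K, N + 1)

K = 100
for N in range(1, 6):
    print(N, loss_probability(K, N), loss_probability_reduction(K, N))
```

For large $K$ the reduction behaves like $e^{-N}(1 - e^{-1})$, so each additional region removes a geometrically shrinking share of the remaining loss events.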

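Note that when $r_{th,N+1} < X_{(1)} < r_{th,N}$, the system with $N+1$ regions will schedule the user with the highest SNR, while the system with $N$ regions will randomly schedule a user. Otherwise, the two systems have the same scheduling operations. Without loss of generality, assume that the randomly scheduled user is user $k$. When the number of regions increases from $N$ to $N+1$, the sum rate increment $\Delta R_{in}(K, N)$ can be upper bounded by

$$\begin{aligned}
\Delta R_{in}(K, N) &= M_t\, E\{\log(1 + X_{(1)}) - \log(1 + X_k) \mid r_{th,N+1} < X_{(1)} < r_{th,N}\}\, P_I \\
&\le M_t\, E\{\log(1 + (X_{(1)} - X_k)) \mid r_{th,N+1} < X_{(1)} < r_{th,N}\}\, P_I \\
&\le M_t \log\left(1 + E\{(X_{(1)} - X_k) \mid r_{th,N+1} < X_{(1)} < r_{th,N}\}\right) P_I \\
&\triangleq \Delta R_{in,U}(K, N), \qquad (21)
\end{aligned}$$

where

$$\begin{aligned}
E\{(X_{(1)} - X_k) \mid r_{th,N+1} < X_{(1)} < r_{th,N}\}
&= \frac{1}{P_I}\Bigg\{\int_{r_{th,N+1}}^{r_{th,N}} x\,\frac{K}{\rho}e^{-x/\rho}\left(1 - e^{-x/\rho}\right)^{K-1}dx \\
&\qquad - \left[\int_0^{r_{th,N+1}} P_A\,\frac{x}{\rho}e^{-x/\rho}\,dx + \int_{r_{th,N+1}}^{r_{th,N}} P_B\,\frac{x}{\rho}e^{-x/\rho}\,dx\right]\Bigg\} \\
&= \frac{1}{P_I}\Bigg\{\frac{K}{\rho}\sum_{i=0}^{K-1}\frac{(-1)^i (K-1)!}{i!\,(K-i-1)!}\cdot\frac{(c\,r_{th,N+1} + 1)e^{-c\,r_{th,N+1}} - (c\,r_{th,N} + 1)e^{-c\,r_{th,N}}}{c^2} \\
&\qquad - \Big(P_A\left[\rho - (\rho + r_{th,N+1})e^{-r_{th,N+1}/\rho}\right] + P_B\left[(\rho + r_{th,N+1})e^{-r_{th,N+1}/\rho} - (\rho + r_{th,N})e^{-r_{th,N}/\rho}\right]\Big)\Bigg\},
\end{aligned}$$

with $c = \frac{i+1}{\rho}$, and where

$$P_A = \left(1 - e^{-r_{th,N}/\rho}\right)^{K-1} - \left(1 - e^{-r_{th,N+1}/\rho}\right)^{K-1},$$

$$P_B = \left(1 - e^{-r_{th,N}/\rho}\right)^{K-1}.$$

Fig. 4. The sum rate increment when the number of regions is increased from $N$ to $N+1$. $\rho = 10$ dB, $K = 100$.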

Furthermore, $\Delta R_{in}(K, N)$ can be lower bounded by

$$\begin{aligned}
\Delta R_{in}(K, N) &= M_t\, E\{\log(1 + X_{(1)}) - \log(1 + X_k) \mid r_{th,N+1} < X_{(1)} < r_{th,N}\}\, P_I \\
&\ge M_t\left[\log(1 + r_{th,N+1}) - E\{\log(1 + X_k) \mid r_{th,N+1} < X_{(1)} < r_{th,N}\}\right] P_I \\
&\ge M_t\left[\log(1 + r_{th,N+1}) - \log\left(1 + E\{X_k \mid r_{th,N+1} < X_{(1)} < r_{th,N}\}\right)\right] P_I \\
&\triangleq \Delta R_{in,L}(K, N). \qquad (22)
\end{aligned}$$

Fig. 4 compares the simulated sum rate increment with the upper bound (21) and lower bound (22) for different numbers of regions $N$ when the number of users is $K = 100$. As in Fig. 3, the bounds become tighter when $N$ is large. In addition, the sum rate increment gets smaller as the number of regions gets larger. Eventually the sum rate increment will approach zero (when $N = K$, the users always feed back and the sum rate increment is exactly zero).

Fig. 5 shows the simulation results of the sum rate for different numbers of regions when the number of users increases. It is shown that the sum rate increases with both the number of users and the number of regions. Both the sum rate loss and the sum rate increment decrease with the number of regions, as already shown in Fig. 3 and Fig. 4, respectively. When the number of regions is larger than four, the sum rate achieved by the multi-threshold scheme is very close to the sum rate with full CSI.
Fig. 5. Sum rate performance of the multi-threshold scheme with different numbers of regions, $\rho = 10$ dB. 



C. Multiuser Diversity Using the Multi-threshold Scheme 

In this section, we characterize the asymptotic sum rate scaling of the multi-threshold feedback scheme with respect to the number of users, that is, the multiuser diversity. The sum rate using this scheme can be expressed as

$$\begin{aligned}
R(K, N) &= M_t\, P\{r_{th,N} \le X_{(1)} < \infty\}\, E\{\log(1 + X_{(1)}) \mid r_{th,N} \le X_{(1)} < \infty\} \\
&\quad + M_t\, P\{0 \le X_{(1)} < r_{th,N}\}\, E\{\log(1 + X_k) \mid 0 \le X_{(1)} < r_{th,N}\}. \qquad (23)
\end{aligned}$$

When the number of users is large, this sum rate exhibits the following property.

Theorem 1: Let $M_t$, $\rho$, $N$ be given, and let the lowest threshold be $r_{th,N} = \rho\ln(K/N)$. The achievable sum rate $R(K, N)$ of the multi-threshold feedback scheme satisfies

$$\lim_{K \to \infty}\frac{R(K, N)}{M_t \log\log K} = 1 - e^{-N}.$$

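Proof: The sum rate can be lower bounded by

$$\begin{aligned}
R(K, N) &\ge M_t\, P\{r_{th,N} \le X_{(1)} < \infty\}\, E\{\log(1 + X_{(1)}) \mid r_{th,N} \le X_{(1)} < \infty\} \\
&\ge M_t (1 - P_L)\log(1 + r_{th,N}) \triangleq R_L(K, N), \qquad (24)
\end{aligned}$$

where $P_L$ is the probability of the rate loss event defined in (17). For $R_L(K, N)$, we have

$$\lim_{K \to \infty}\frac{R_L(K, N)}{M_t \log\log K} = \lim_{K \to \infty}\frac{(1 - P_L)\,M_t\log(1 + r_{th,N})}{M_t \log\log K} = \lim_{K \to \infty}\left(1 - \left(1 - \frac{N}{K}\right)^{K}\right)\frac{\log\left(1 + \rho\ln(K/N)\right)}{\log\log K} = 1 - e^{-N}.$$

On the other hand, using Jensen's inequality, the sum rate $R(K, N)$ can be upper bounded by

$$R(K, N) \le M_t (1 - P_L)\log\left(1 + E\{X_{(1)} \mid r_{th,N} \le X_{(1)} < \infty\}\right) + M_t P_L \log\left(1 + E\{X_k\}\right). \qquad (25)$$

According to [15], for i.i.d. random variables $X_1, X_2, \ldots, X_K$ having the same CDF $F_X(x)$,

$$\int_0^1 F_X^{-1}(u)\,du \le E\{X_{(1)}\} \le K\int_{\frac{K-1}{K}}^{1} F_X^{-1}(u)\,du.$$

Thus, $E\{X_{(1)} \mid r_{th,N} \le X_{(1)} < \infty\}$ can be upper bounded by

$$\begin{aligned}
E\{X_{(1)} \mid r_{th,N} \le X_{(1)} < \infty\} &\le E\{X_{(1)} \mid r_{th,N} \le X_k < \infty,\ k = 1, 2, \ldots, K\} \\
&\le K\int_{\frac{K-1}{K}}^{1} F_{X \mid X \ge r_{th,N}}^{-1}(u)\,du = K\int_{\frac{K-1}{K}}^{1} -\rho\ln\left((1-u)\,e^{-r_{th,N}/\rho}\right)du \\
&= K\int_{\frac{K-1}{K}}^{1}\left(F_{X_k}^{-1}(u) + r_{th,N}\right)du = \rho\left(\ln(K) + 1\right) + r_{th,N}, \qquad (26)
\end{aligned}$$

where $F_{X \mid X \ge r_{th,N}}(x)$ is the conditional CDF of $X$, which is distributed like $F_{X_{m,k}}(x)$ defined in (5), given that $X \ge r_{th,N}$. That is,

$$F_{X \mid X \ge r_{th,N}}(x) = 1 - \frac{e^{-x/\rho}}{e^{-r_{th,N}/\rho}}.$$

Substituting (26) and $E\{X_k\} = \rho$ into (25), an upper bound of the sum rate $R_U(K, N)$ can be defined as

$$R(K, N) \le M_t\left\{(1 - P_L)\log\left(1 + \rho(\ln(K) + 1) + r_{th,N}\right) + P_L\log(1 + \rho)\right\} \triangleq R_U(K, N).$$

For the upper bound $R_U(K, N)$, we have

$$\begin{aligned}
\lim_{K \to \infty}\frac{R_U(K, N)}{M_t \log\log K} &= \lim_{K \to \infty}\frac{(1 - P_L)\log\left(1 + \rho(\ln(K) + 1) + r_{th,N}\right)}{\log\log K} + \lim_{K \to \infty}\frac{P_L\log(1 + \rho)}{\log\log K} \\
&= \lim_{K \to \infty}\left(1 - \left(1 - \frac{N}{K}\right)^{K}\right)\frac{\log\left(1 + 2\rho\ln(K) + \rho - \rho\ln(N)\right)}{\log\log K} \\
&= 1 - e^{-N}.
\end{aligned}$$

The proof is complete. ■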
This theorem shows that with given $M_t$, $\rho$ and $N$, when the number of users $K$ is large, the multi-threshold feedback scheme can achieve a sum rate which scales like $(1 - e^{-N})M_t\log\log(K)$. In other words, this scheme can asymptotically achieve a constant portion $(1 - e^{-N})$ of the optimal sum rate $M_t\log\log(K)$ achievable with full CSI feedback. The remaining portion, i.e., the sum rate loss, decreases exponentially to zero as the number of regions $N$ increases. This result can be observed from Fig. 5, where it is shown that the sum rate loss is already very small when the number of regions is four.
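
The two bounds used in the proof can be evaluated numerically to see how the scaling behaves for finite $K$. The sketch below is illustrative only: the function name is hypothetical, and it assumes the exponential SNR model with $r_{th,N} = \rho\ln(K/N)$. It returns the lower and upper bounds normalized by $M_t\log\log K$, together with the prefactor $1 - (1 - N/K)^K$.

```python
import math

def normalized_bounds(K, N, rho):
    """Lower and upper sum rate bounds divided by M_t * log2(log2(K)),
    plus the prefactor 1 - (1 - N/K)**K that tends to 1 - exp(-N)."""
    r = rho * math.log(K / N)                  # lowest threshold
    p_loss = (1.0 - N / K) ** K                # rate loss probability
    denom = math.log2(math.log2(K))
    lower = (1.0 - p_loss) * math.log2(1.0 + r) / denom
    upper = ((1.0 - p_loss) * math.log2(1.0 + rho * (math.log(K) + 1.0) + r)
             + p_loss * math.log2(1.0 + rho)) / denom
    return lower, upper, 1.0 - p_loss

for K in (10 ** 3, 10 ** 6, 10 ** 12):
    print(K, normalized_bounds(K, N=4, rho=10.0))
```

The prefactor approaches $1 - e^{-N}$ quickly, whereas the ratio of logarithms approaches one only at a $\log\log K$ rate, so the normalized bounds remain visibly above the limit even for very large $K$. The theorem is a statement about scaling, not about the finite-$K$ values.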

IV. Bit Allocation and Feedback Load Analysis 

In this section, we consider practical quantization and feeding back the CSI values with finite numbers of bits. Assume that the users use $B_R$ bits to represent the region information, and additional $b_j$ bits to quantize the SNR when it falls in region $j$. On a given beam, whenever there are other users feeding back the same rank indication and the same additional quantized bits as the user who actually has the highest SNR, the BS will randomly schedule one of them. As a result, the lowest possible rate due to this ambiguity in scheduling will be the rate derived from the lower boundary of the SNR quantization region in question. Let $\mathbf{B} = (b_1, b_2, \ldots, b_N)$ be the vector of the numbers of bits for quantizing the SNR in regions $1, 2, \ldots, N$, respectively. We assume the optimal nonuniform quantization [16], [17] for each region. The sum rate $R_q(\mathbf{B})$ with both rank and SNR quantization feedback can be lower bounded by

$$R_q(\mathbf{B}) \ge M_t\sum_{j=1}^{N}\sum_{t=1}^{2^{b_j}}\int_{r_{th,j,t}}^{r_{th,j,t+1}}\log\left(1 + r_{th,j,t}\right) f_{X_{(1)}}(x)\,dx, \qquad (27)$$

where $r_{th,j,1}, r_{th,j,2}, \ldots, r_{th,j,2^{b_j}+1}$ are the quantization levels in rank region $j$, with $r_{th,j,1} = r_{th,j}$ and $r_{th,j,2^{b_j}+1} = r_{th,j-1}$. Fig. 6 shows that the analytical lower bound of the sum rate (27) almost matches the simulation result when $\mathbf{B} = (0, 0, 0, 3)$. Thus, the bound (27) is very tight.

Fig. 6. Sum rate comparison between the mathematical lower bound and the simulation result with $\mathbf{B} = (0, 0, 0, 3)$, $\rho = 10$ dB.

We now discuss how to allocate available bits to quantize each region to achieve the maximum sum rate. Because all users have the same probability $\frac{1}{K}$ of inferring themselves as being ranked the $j$-th, $j = 1, 2, \ldots, N$, on each beam direction, the expected number of feedback bits required in addition to the rank bits is $\sum_{j=1}^{N}\frac{b_j}{K}$. Dropping the constant, the bit allocation problem of $N$ regions for a given beam $m$ becomes

$$\max_{\mathbf{B}}\ R_q(\mathbf{B}) \quad \text{s.t.}\ \sum_{j=1}^{N} b_j = B_Q, \quad b_j \in \mathbb{Z}_+. \qquad (28)$$

A. Optimal Bit Allocation 

The problem (28) can be solved by the greedy algorithm, which is to assign one bit at a time to the region that will result in the maximum sum rate, because adding one more bit to any of the regions will increase the average feedback load by the same amount.

For the $s$-th single bit assigning iteration, the sum rate difference between using $b_{l,s-1}$ bits and $b_{l,s-1} + 1$ bits for region $l$ can be expressed as

$$\Delta R_l^{(s)} = R^{l}_{b_{l,s-1}+1} - R^{l}_{b_{l,s-1}}, \quad l = 1, 2, \ldots, N, \qquad (29)$$

where $b_{l,s-1}$ is the number of bits for quantizing region $l$, resulting from the $(s-1)$-th bit assigning iteration. $R^{l}_{m}$ is the sum rate of region $l$ using $m$ quantization bits, and can be approximated by $M_t\sum_{j=1}^{2^m}\int_{r_{th,l,j}}^{r_{th,l,j+1}}\log\left(1 + r_{th,l,j}\right) f_{X_{(1)}}(x)\,dx$. The region which gives the maximum sum rate increment with one additional bit will be assigned one more bit at the $s$-th iteration. The algorithm iterates until all the available quantization bits are allocated, i.e., when $s = B_Q$. The greedy algorithm is summarized in Table I.

TABLE I 

The greedy algorithm for bit allocation. 

Initialize $b_l = 0$, $l = 1, 2, \ldots, N$. 
For ($s = 1$ to $B_Q$) 
    $\hat{l} = \arg\max_{l = 1, 2, \ldots, N} \Delta R_l^{(s)}$ 
    $b_{\hat{l}} = b_{\hat{l}} + 1$ 
End 
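
The greedy procedure in Table I can be sketched as follows. This is an illustration under explicit assumptions rather than the authors' implementation: the SNR is exponential with mean $\rho$ and thresholds $r_{th,l} = \rho\ln(K/l)$, and, since the optimal nonuniform quantizer is not specified here, each region is split into equal-probability cells (a stand-in choice). The function names `region_rate` and `greedy_bit_allocation` are hypothetical.

```python
import math

def region_rate(l, m, K, rho, M_t):
    """Approximate rate contribution of region l with m quantization bits.

    Each of the 2**m cells is credited with the rate at its lower edge,
    weighted by the probability that the largest SNR falls in that cell.
    Cells are equal-probability under the single-user CDF (an assumption).
    """
    F_low = 1.0 - l / K                    # single-user CDF at r_th_l
    cells = 2 ** m
    rate = 0.0
    for t in range(cells):
        u_lo = F_low + t / (cells * K)          # CDF value at the lower cell edge
        u_hi = F_low + (t + 1) / (cells * K)    # CDF value at the upper cell edge
        x_lo = -rho * math.log(1.0 - u_lo)      # SNR at the lower cell edge
        rate += math.log2(1.0 + x_lo) * (u_hi ** K - u_lo ** K)
    return M_t * rate

def greedy_bit_allocation(N, B_Q, K, rho, M_t):
    """Assign B_Q bits one at a time to the region with the largest rate gain."""
    bits = [0] * N
    for _ in range(B_Q):
        gains = [region_rate(l, bits[l - 1] + 1, K, rho, M_t)
                 - region_rate(l, bits[l - 1], K, rho, M_t)
                 for l in range(1, N + 1)]
        best = max(range(N), key=gains.__getitem__)
        bits[best] += 1
    return bits

print(greedy_bit_allocation(N=4, B_Q=3, K=100, rho=10.0, M_t=4))
```

With dyadic equal-probability cells, adding a bit refines the existing partition, so the rate contribution of a region never decreases. The resulting allocation depends on the quantizer assumed, so it need not coincide with the allocations reported in the simulations.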



B. Fast Bit Allocation Method 



The sum rate formula (27) is difficult to compute. We alternatively consider minimizing the
mean square quantization error as a suboptimal but simple solution. The conditional PDF of the
SNR in region $j$ for a given beam $m$ is

$$f_{X_{m,k}}(x \mid r_{th,j} < x < r_{th,j-1}) = \frac{f_{X_{m,k}}(x)}{\Pr(r_{th,j} < x < r_{th,j-1})} = \frac{K}{\rho} e^{-x/\rho}, \quad r_{th,j} < x < r_{th,j-1}. \tag{30}$$

The SNR variance in region $j$ can be expressed as

$$\sigma_j^2 = \int_{r_{th,j}}^{r_{th,j-1}} x^2 \frac{K}{\rho} e^{-x/\rho}\,dx - \left( \int_{r_{th,j}}^{r_{th,j-1}} x \frac{K}{\rho} e^{-x/\rho}\,dx \right)^2.$$

Thus, the variance of the quantization error using $b_j$ bits, denoted here by $\sigma_{q,j}^2$, can be bounded by [18]

$$\sigma_{q,j}^2 \le \epsilon\, \sigma_j^2\, 2^{-2 b_j}, \tag{31}$$

where the constant $\epsilon$ is source dependent. For example, $\epsilon = 1.0$ for uniformly distributed sources
and $\epsilon = 2.17$ for Gaussian sources. In our case, the SNR PDFs for different regions are different,
thus the $\epsilon$ that gives the tightest bound (31) will be different for different regions. In order to
simplify the computation, we set the same $\epsilon$ for all regions such that the upper bound (31) is
always valid. Note that this simplification is reasonable when $K$ is large such that the SNR
distribution is almost uniform in all regions. We further relax the constraint for the number of
quantization bits in (28) from being a positive integer to being a positive real number. Then, a new
bit allocation problem based on minimizing the upper bound of the variance of the quantization
error can be formulated as



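$$\min_{b_1, \ldots, b_N} \; \sum_{j=1}^{N} \sigma_j^2\, 2^{-2 b_j} \quad \text{s.t.} \quad \begin{cases} \sum_{j=1}^{N} b_j = B_Q \\ 0 \le b_j \le B_Q, \quad j = 1, 2, \ldots, N, \end{cases} \tag{32}$$
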

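where in the objective function, the same constant $\epsilon$ and the same probability $1/K$ for all regions
are dropped for conciseness without changing the problem. Since the optimization problem (32)
is convex, we can apply the Karush-Kuhn-Tucker (KKT) conditions [19] to solve it. To simplify
the expression in (32), we let

$$L(\mathbf{B}, \lambda, \nu_1, \ldots, \nu_N, \delta_1, \ldots, \delta_N) = f_0(\mathbf{B}) + \lambda f_1(\mathbf{B}) + \sum_{j=1}^{N} \nu_j h_j(\mathbf{B}) + \sum_{j=1}^{N} \delta_j g_j(\mathbf{B}), \tag{33}$$

where

$$f_0(\mathbf{B}) = \sum_{j=1}^{N} \sigma_j^2\, 2^{-2 b_j}, \quad f_1(\mathbf{B}) = \sum_{j=1}^{N} b_j - B_Q, \quad h_j(\mathbf{B}) = -b_j, \quad g_j(\mathbf{B}) = b_j - B_Q.$$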



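Since $f_0$, $f_1$, $h_j$, and $g_j$ are differentiable, the KKT conditions for this problem are

$$\begin{aligned}
\frac{\partial L(\mathbf{B}, \lambda, \nu_1, \ldots, \nu_N, \delta_1, \ldots, \delta_N)}{\partial b_j} &= 0, & j &= 1, 2, \ldots, N, \\
f_1(\mathbf{B}) &= 0, \\
h_j(\mathbf{B}) \le 0, \quad g_j(\mathbf{B}) &\le 0, & j &= 1, 2, \ldots, N, \\
\nu_j h_j(\mathbf{B}) = 0, \quad \delta_j g_j(\mathbf{B}) &= 0, & j &= 1, 2, \ldots, N, \\
\nu_j \ge 0, \quad \delta_j &\ge 0, & j &= 1, 2, \ldots, N.
\end{aligned} \tag{34}$$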



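From $\partial L(\mathbf{B}, \lambda, \nu_1, \ldots, \nu_N, \delta_1, \ldots, \delta_N) / \partial b_j = 0$, we have

$$\nu_j = (-2 \ln 2)\, 2^{-2 b_j} \sigma_j^2 + \lambda + \delta_j. \tag{35}$$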



Substitute (35) into the fourth condition in (34). By considering the $\{\nu_j = 0, \delta_j = 0, 0 < b_j < B_Q\}$,
$\{\nu_j = 0, \delta_j > 0, b_j = B_Q\}$ and $\{\nu_j > 0, \delta_j = 0, b_j = 0\}$ cases separately, and defining $W$
as

$$W = \frac{\ln(\lambda / \ln 4)}{\ln 10}, \tag{36}$$

where $T_j = \frac{\ln \sigma_j^2}{\ln 10}$ and $V = \frac{B_Q \ln 4}{\ln 10}$, $b_j$ can be obtained by

$$b_j = \begin{cases} 0, & W > T_j \\ \dfrac{(T_j - W) \ln 10}{\ln 4}, & T_j - V \le W \le T_j \\ B_Q, & W < T_j - V. \end{cases} \tag{37}$$

The obtained $b_j$'s are then rounded to be nonnegative integers. Through simulation, we found
that when $B_Q$ is sufficiently small ($B_Q < 3$), the optimal bit allocation has the form
$\mathbf{B} = (0, 0, \ldots, 0, B_Q)$.
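The relaxed problem (32) and the piecewise solution (37) can be illustrated with a short Python sketch. Several things here are assumptions rather than statements from the text: the region thresholds are taken as $r_{th,j} = \rho \ln(K/j)$ with $r_{th,0} = \infty$ (so that each region has probability $1/K$ under an exponential SNR with mean $\rho$), $T_j$ is read as $\log_{10} \sigma_j^2$, and the water level $W$ is found by bisection on the constraint $\sum_j b_j = B_Q$. The function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def region_variances(num_users: int, num_regions: int, rho: float) -> List[float]:
    """Per-region SNR variance, assuming thresholds r_th,j = rho * ln(K / j)."""
    def tail_moments(x: float) -> Tuple[float, float]:
        # First and second partial moments over [x, inf) of f(t) = exp(-t/rho)/rho.
        if math.isinf(x):
            return 0.0, 0.0
        tail = math.exp(-x / rho)
        return (x + rho) * tail, (x * x + 2 * rho * x + 2 * rho * rho) * tail

    thresholds = [math.inf] + [rho * math.log(num_users / j)
                               for j in range(1, num_regions + 1)]
    variances = []
    for j in range(1, num_regions + 1):
        lo1, lo2 = tail_moments(thresholds[j])
        hi1, hi2 = tail_moments(thresholds[j - 1])
        mean = num_users * (lo1 - hi1)    # conditional mean (region prob. is 1/K)
        second = num_users * (lo2 - hi2)  # conditional second moment
        variances.append(second - mean * mean)
    return variances


def fast_bit_allocation(variances: Sequence[float], total_bits: float) -> List[float]:
    """Real-valued allocation: clip (T_j - W) ln10 / ln4 to [0, B_Q], sum = B_Q."""
    t = [math.log10(v) for v in variances]
    scale = math.log(10) / math.log(4)

    def bits_at(w: float) -> List[float]:
        return [min(max((tj - w) * scale, 0.0), total_bits) for tj in t]

    # The total is nonincreasing in W: N * B_Q at the low end, 0 at the high end.
    low, high = min(t) - total_bits / scale, max(t)
    for _ in range(200):
        mid = 0.5 * (low + high)
        if sum(bits_at(mid)) > total_bits:
            low = mid
        else:
            high = mid
    return bits_at(0.5 * (low + high))


sigma2 = region_variances(num_users=100, num_regions=4, rho=10.0)
bits = fast_bit_allocation(sigma2, total_bits=5)
print([round(b, 2) for b in bits])
```

The values returned are real numbers; as noted above, they would still need to be rounded to nonnegative integers, and plain rounding does not by itself guarantee that the rounded bits sum to $B_Q$.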

C. Feedback Load Analysis 

Let $B_R$ be the number of feedback bits carrying the rank information and $B_Q$ be defined
in (28). For the multi-threshold feedback scheme, the average number of feedback bits for the
network when the number of users is $K$ can be expressed as

$$F_b = K M_t \sum_{j=1}^{N} \frac{1}{K} (B_R + b_j) = M_t (N B_R + B_Q), \tag{38}$$

which does not increase with the number of users, and is a constant when the number of
transmission beams $M_t$ and the number of regions $N$ are fixed. This is in contrast to the
conventional feedback schemes whose total feedback load for the network increases with the
number of users.
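As a quick numerical check of (38), the following Python sketch evaluates both the expectation form and the closed form. The rank-bit count `rank_bits=2` and the allocation `(0, 0, 1, 4)` are assumed values chosen for illustration (with $M_t = 4$, $N = 4$, and $B_Q = 5$ they reproduce the 52 bits quoted in Section V); the function names are hypothetical.

```python
from typing import Sequence


def expected_feedback_load(num_users: int, num_beams: int, rank_bits: int,
                           bit_allocation: Sequence[int]) -> float:
    """Left-hand side of (38): K * M_t * sum_j (1/K) * (B_R + b_j)."""
    per_beam = sum((rank_bits + b) / num_users for b in bit_allocation)
    return num_users * num_beams * per_beam


def closed_form_feedback_load(num_beams: int, rank_bits: int,
                              bit_allocation: Sequence[int]) -> int:
    """Right-hand side of (38): M_t * (N * B_R + B_Q)."""
    num_regions = len(bit_allocation)
    return num_beams * (num_regions * rank_bits + sum(bit_allocation))


allocation = (0, 0, 1, 4)  # assumed example allocation with B_Q = 5
for k in (10, 100, 1000):
    print(k, expected_feedback_load(k, 4, 2, allocation),
          closed_form_feedback_load(4, 2, allocation))
```

The factor $K$ cancels against the per-region probability $1/K$, which is why the load printed for each `k` does not change.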

V. Numerical Results 

In this section, we compare different feedback schemes in terms of the sum rate and feedback
load performance using simulation. The transmitter is equipped with $M_t = 4$ antennas and
there are $K$ users each having $M_r = 4$ antennas. For the conventional feedback scheme, named
Scheme A, each user always feeds back to the BS the SNR values of the $M_t$ beams. A reduced
feedback scheme was proposed in [20] where each user only feeds back its largest SNR value
among all beams and the corresponding beam index. We refer to this scheme as Scheme B.
The feedback loads of both Scheme A and Scheme B increase with the number of users. The
multi-threshold scheme we propose is referred to as Scheme C. The single threshold feedback
scheme proposed in [8] will be called Scheme D. For that scheme, each user feeds back the
SNR value of a beam direction when the SNR is greater than the threshold. In [8], the threshold
is determined by the scheduling outage probability $P_{out}$, which is the probability that none of the
users feeds back. In the performance comparison, we additionally introduce a slightly modified
Scheme D based on the design philosophy proposed in this paper by setting the threshold as
$r_{th,N}$ of Scheme C, such that the $P_{out}$ of this scheme equals the probability of the rate loss
event $P_L$ of Scheme C in (17). Thus, in the comparison, we will consider Scheme D with
constant $P_{out} = 10^{-1}$, $10^{-4}$, and $P_{out} = P_L = (1 - N/K)^K$.
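The relation between the outage probability and the threshold can be checked numerically. The sketch below assumes, as elsewhere in this section, that each user's per-beam SNR is exponential with mean $\rho$, so that $P_{out} = (1 - e^{-r/\rho})^K$ for a threshold $r$; the function names are hypothetical.

```python
import math


def rate_loss_probability(num_users: int, num_regions: int) -> float:
    """P_L = (1 - N/K)^K: no user falls in any of the N regions."""
    return (1.0 - num_regions / num_users) ** num_users


def outage_threshold(p_out: float, num_users: int, rho: float) -> float:
    """Threshold r solving (1 - exp(-r/rho))^K = p_out."""
    return -rho * math.log(1.0 - p_out ** (1.0 / num_users))


K, N, rho = 100, 4, 10.0
p_l = rate_loss_probability(K, N)
# With P_out = P_L the threshold is rho * ln(K / N) under the stated assumption.
print(outage_threshold(p_l, K, rho), rho * math.log(K / N))
# P_L approaches exp(-N) as the number of users grows.
print(rate_loss_probability(10**6, N), math.exp(-N))
```

Setting $P_{out} = P_L$ therefore ties the Scheme D threshold to the number of users, whereas a constant $P_{out}$ yields a different threshold for every $K$.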

In the simulation, Scheme A and Scheme B use $B_{Q,A}$ and $B_{Q,B}$ bits, respectively, to
optimally quantize their SNR values. Scheme C has $B_Q$ bits allocated to $N = 4$ regions
using the fast bit allocation method in Section IV-B. The number of regions is chosen to
guarantee that the sum rate loss upper bound in (18) is smaller than the system tolerable sum rate loss
$\Delta R_P(K) = 0.25$ bps/Hz. Note that the bit allocation of Scheme C depends on the number
of users. For Scheme D, $B_{Q,D}$ bits are used to optimally quantize the region $[\text{threshold}, \infty)$,
where the threshold depends on the scheduling outage probability $P_{out}$.

Fig. 7 compares the sum rates of different feedback schemes as the number of users increases.
The numbers of SNR quantization bits defined above for the different schemes are all set to five. Note
that for different schemes, the relationships between the number of SNR quantization bits and
the total feedback load are different. Therefore, Fig. 7 is shown only to illustrate the performance
difference between similar schemes. With the same number of SNR quantization bits,
Scheme A's total feedback load is roughly four times that of Scheme B. Thus Scheme A's
sum rate is higher than that of Scheme B, with the sum rate difference getting smaller as the
number of users increases. This is because when the number of users is large, feeding back only
the largest SNR among all beams is good enough for the purpose of scheduling. For Scheme D,
setting the threshold such that $P_{out} = 10^{-4}$ results in a higher sum rate compared to setting the
threshold as $r_{th,4}$ when the number of users is large. This is because the scheduling outage
probability of the latter increases with the number of users, and is higher than that of the former
when the number of users is large. With $B_Q = 5$, Scheme C's average number of feedback bits
is $M_t (N B_R + B_Q) = 52$, which is less than the $M_t N B_{Q,D} = 80$ of Scheme D using the threshold
$r_{th,4}$ ($P_{out} = P_L$). Thus Scheme C's sum rate is lower than that of Scheme D.

Fig. 8 shows the average total feedback loads of the cases considered in Fig. 7 and confirms
the above discussion on the numbers of feedback bits for similar schemes. For example, the
feedback loads of Scheme A and Scheme B grow linearly with the number of users, and the
slope of Scheme A is four times that of Scheme B because Scheme A's users feed back the
SNR of every beam. Scheme C and Scheme D with threshold $r_{th,4}$ have constant feedback
loads as discussed in Section IV-C. On the other hand, Scheme D with constant scheduling
outage probability ($P_{out} = 10^{-1}, 10^{-4}$) has its feedback load increasing with the number of users,
but saturating when the number of users is high. This is because when $P_{out}$ is fixed, Scheme D's
threshold is $-\rho \ln(1 - P_{out}^{1/K})$. When the number of users is large, Scheme D's feedback load
is $\lim_{K \to \infty} M_t B_{Q,D} K (1 - P_{out}^{1/K}) = M_t B_{Q,D} \ln(1/P_{out})$. Thus the feedback load behaviors of the
three Scheme Ds are similar when the number of users is large.
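The saturation of Scheme D's feedback load for a fixed $P_{out}$ can be seen numerically. The following Python sketch evaluates $M_t B_{Q,D} K (1 - P_{out}^{1/K})$ for growing $K$ and compares it with the limit $M_t B_{Q,D} \ln(1/P_{out})$; the function names are hypothetical, and `math.expm1` is used only to keep the evaluation accurate when $K$ is large.

```python
import math


def scheme_d_load(num_users: int, num_beams: int, quant_bits: int,
                  p_out: float) -> float:
    """Average load M_t * B_Q,D * K * (1 - p_out**(1/K)) for fixed outage."""
    # 1 - p_out**(1/K) written as -expm1(ln(p_out)/K) for numerical accuracy.
    feedback_prob = -math.expm1(math.log(p_out) / num_users)
    return num_beams * quant_bits * num_users * feedback_prob


def scheme_d_load_limit(num_beams: int, quant_bits: int, p_out: float) -> float:
    """Large-K limit M_t * B_Q,D * ln(1 / p_out)."""
    return num_beams * quant_bits * math.log(1.0 / p_out)


for p_out in (1e-1, 1e-4):
    loads = [scheme_d_load(k, 4, 5, p_out) for k in (10, 100, 1000, 10**6)]
    print(p_out, [round(x, 2) for x in loads],
          round(scheme_d_load_limit(4, 5, p_out), 2))
```

The load increases with the number of users but stays below the limit, in line with the saturation described above.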

For a fair comparison between the feedback schemes, the results of Fig. 7 and Fig. 8 are
combined to show the sum rate as a function of the feedback load in Fig. 9. That is, the sum
rate of each simulation case in Fig. 7 and its corresponding feedback load in Fig. 8 form a data
point in Fig. 9. As shown in Fig. 9, for Scheme A and Scheme B, the feedback load has to be
increased if a higher sum rate is desired. Scheme C and Scheme D with $r_{th,4}$ as the threshold
($P_{out} = P_L$), which is based on the same design philosophy as Scheme C, have much lower
and fixed feedback loads as their sum rates grow like $(1 - e^{-N}) M_t \log\log(K)$. It can be seen
that, to achieve the same sum rate, Scheme C requires a lower feedback load than Scheme D.
Note that, based on the design philosophy in [8], Scheme D with constant scheduling outage
probability behaves similarly to Scheme A and Scheme B. In fact, if the scheduling outage
probability is set to zero, Scheme D will become exactly the same as Scheme A. When
Scheme D's $P_{out}$ is large, its sum rate loss is also large.

Fig. 10 compares the sum rate performance of Scheme C using the optimal and fast bit
allocation methods discussed in Section IV. It is shown that the sum rate performances for these
two bit allocation methods are visually indistinguishable. Thus, the fast bit allocation method is
preferred for all practical purposes.



Fig. 7. Sum rate performance comparison for different feedback schemes, $\rho = 10$ dB.

Fig. 8. Feedback load comparison for different feedback schemes, $\rho = 10$ dB.

Fig. 9. Sum rate as a function of the feedback load, $\rho = 10$ dB.

Fig. 10. Sum rate performance comparison for different bit allocation methods in Scheme C with $N = 4$ regions, $\rho = 10$ dB.

VI. Conclusion

In this paper, we proposed a multi-threshold feedback scheme for the MIMO broadcast
channel to reduce the aggregate feedback load. The minimum number of regions (thresholds)
required for a given tolerable sum rate loss was found, and the upper and lower bounds for the
increment of sum rate with every additional region were derived. The multiuser diversity using
the multi-threshold scheme was also investigated. Finally, the optimal bit allocation and a fast bit
allocation algorithm for the multi-threshold scheme were discussed. Analytical and simulation
results showed that the proposed multi-threshold feedback scheme can reduce the feedback load
and utilize the limited feedback bandwidth more effectively than the existing feedback methods.
In particular, while keeping the aggregate feedback load of the entire system constant regardless
of the number of users, the proposed scheme almost achieves the optimal asymptotic sum rate
scaling with respect to the number of users (i.e., the multiuser diversity).